DOCUMENT RESUME

ED 418 992                                TM 028 276

AUTHOR       Henard, David H.
TITLE        Using Spreadsheets To Implement the One-Parameter Item
             Response Theory (IRT) Model.
PUB DATE     1998-04-11
NOTE         38p.; Paper presented at the Annual Meeting of the
             Southwestern Psychological Association (New Orleans, LA,
             April 1998). Small print in tables/figures may not reproduce
             clearly.
PUB TYPE     Reports - Evaluative (142) -- Speeches/Meeting Papers (150)
EDRS PRICE   MF01/PC02 Plus Postage.
DESCRIPTORS  Computer Oriented Programs; Computer Software; Heuristics;
             *Item Response Theory; *Mathematical Models; *Spreadsheets;
             *Test Items
IDENTIFIERS  *One Parameter Model; *Rasch Model



ABSTRACT 



Item response theory models arose from the inherent 
limitations of classical test theory methods of test analysis. A brief
description of those limitations and the corresponding enhancements provided 
by item response models is provided. Further, an examination of the popular 
Rasch one-parameter latent trait model is undertaken. Specific explanation of 
the step-by-step calculations in the one-parameter model is accomplished 
using a commonly available spreadsheet. This paper is designed to be used as 
a teaching heuristic to assist students in understanding both the mechanics 
and the rationale behind the item response theory model measurement. 

(Contains nine tables, four figures, and nine references.) (Author/SLD) 






Using Spreadsheets to Implement the 
One-Parameter Item Response Theory (IRT) Model 






David H. Henard 
Texas A&M University 
Mays Graduate School of Business 
College Station, TX 77843-4112 
409.845.5205 (phone) 
409.862.2811 (fax) 
henard@tamu.edu 







Paper presented at the annual meeting of the Southwestern Psychological Association, 
New Orleans, April 11, 1998. 



Item Response Theory 1 



ABSTRACT 

Item response theory models arose from the inherent limitations of classical test 
theory methods of test analysis. A brief description of those limitations and the 
corresponding enhancements provided by item response models is provided. Further, an 
examination of the popular Rasch one-parameter latent trait model is undertaken. 
Specific explanation of the step-by-step calculations in the one-parameter model is 
accomplished using a commonly available spreadsheet. This paper is designed to be used 
as a teaching heuristic to assist students in understanding both the mechanics and the 
rationale behind the item response theory (IRT) model measurement. 



The author would like to express his thanks to Bruce Thompson for his comments on an 
earlier draft of this paper. 







When they were first introduced, item response theory (IRT)/latent trait 
measurement models were heralded as “one of the most important methodological 
advances in psychological measurement in the past half century” (McKinley and Mills 
1989, p. 71). However, the pluses and minuses of these models have been hotly debated 
(cf. Lawson 1991) despite their widespread use in various applications such as test 
equating, item selection and adaptive testing. 

This paper will begin with a brief examination of classical test theory and some of 
its inherent weaknesses. An encompassing examination of classical test theory is beyond 
the scope of the paper. Readers desiring greater discourse on the subject are directed to 
Crocker and Algina (1986) and Nunnally and Bernstein (1994). The focus then shifts to
item response theory with discussion centering on the theoretical framework of the Rasch 
one-parameter IRT model. The concepts underlying and the basic tenets of item response 
theory are explored. Finally, step-by-step calculations involved in the Rasch model will 
be explained using a commonly available spreadsheet. Such spreadsheets can be valuable 
heuristic devices to assist students in truly understanding what is occurring in IRT
measurement. 



Classical Test Theory 

Classical test theory (CTT) and its related methods have a number of limitations.
For example, comparison across examinees is limited to situations where the subjects of 
interest are administered the same (parallel) test items. Also, a false presumption of CTT 
is that the variance of errors of measurement is the same for all examinees. Reality
dictates that some people perform tasks more consistently than others and that
consistency varies with ability (Hambleton and Swaminathan 1985). 

Two major CTT limitations of note are: 

1. Examinee characteristics cannot be separated from test characteristics, and

2. CTT is test-oriented rather than item-oriented. 

The first limitation above can be summarized as a situation of circular 
dependency. The examinee statistic (i.e., observed score) is item sample-dependent while 
the item statistics (i.e., item difficulty, item discrimination) are examinee
sample-dependent. Stated simply, when the test is ‘difficult’, examinees will appear to have lower
ability and when the test is ‘easy’, they will appear to have higher ability. Likewise, the 
‘difficulty’ of a test item is determined by the proportion of examinees who answer it 
correctly and is thus dependent on the abilities of the examinees being measured 
(Hambleton, Swaminathan and Rogers 1991). This circular dependency poses some 
theoretical difficulties in CTT’s application in measurement situations such as test 
equating and computerized adaptive testing. 

The second major limitation listed is a question of orientation. The CTT model 
fails to allow us to predict how an examinee, given a stated ability level, is likely to 
respond to a particular item (Hambleton et al. 1991). Predicting how an individual 
examinee or a group of examinees will perform on a specific item is quite relevant to a 
number of testing applications. Consider the difficulties facing a test designer who wishes 
to predict test scores across multiple groups, or to design an equitable test for a particular 
group, or possibly to compare examinees who take either different tests or the same test at 
differing times. Such inherent limitations of CTT led psychometricians to develop models
that not only overcame these limitations but also led to improved bias detection, enhanced
reliability assessment and increased precision in ability measurement. Item response 
theory (Hambleton and Swaminathan 1985; Hambleton et al. 1991; Lord 1980) provides 
us with a framework to accomplish these desired features. 

Item Response Theory Concept 

Item response theory (IRT) arose out of a psychometric need to overcome the 
limitations of classical test theory and to provide test designers with improved and more 
accurate testing tools. Again, a thorough discussion of IRT is beyond the bounds of the 
present study. Interested readers are directed to Hambleton et al. (1991) and Wright and 
Stone (1979). IRT primarily rests upon two basic postulates (Hambleton et al. 1991; 
Hambleton and Swaminathan 1985): 

1. The performance of an examinee on a test item can be explained (or predicted)
by a set of factors called traits, latent traits or abilities; and

2. The relationship between examinees’ item performance and the trait(s)
underlying item performance can be described by a monotonically
increasing function called the item characteristic curve (ICC).

Several IRT models exist, including the three-parameter, two-parameter and
one-parameter models. The one-parameter model, often referred to as the Rasch model, is the
most commonly used example and will be the focus of this paper. The models differ 
principally in the mathematical form of the ICC and/or the number of parameters 
specified in the model. 
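
For the one-parameter model, the ICC is usually written in the logistic form sketched below (see, e.g., Wright and Stone 1979). The paper does not print this equation, so the notation is an assumption made for illustration: $\theta_j$ is the ability of person $j$, $d_i$ is the difficulty of item $i$, and $x_{ij}$ is the scored response (1 = correct).

```latex
P(x_{ij} = 1 \mid \theta_j, d_i) = \frac{e^{\theta_j - d_i}}{1 + e^{\theta_j - d_i}}
```

Because the probability depends only on the difference $\theta_j - d_i$, every item's curve has the same shape and is simply shifted along the ability axis, and the probability equals 0.50 exactly when $\theta_j = d_i$.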







When an IRT model fits the test data of interest, many of the limitations of CTT 
are resolved. For example, examinee latent trait estimates are theoretically no longer test- 
dependent and item indices are no longer group-dependent. Ability estimates derived 
from different groupings of items will be the same, barring measurement error, and item 
parameter estimates derived from different groups of examinees will also be the same, 
barring sampling error (Hambleton et al. 1991). 



Assumptions of Item Response Models 

Unidimensionality and local independence are two assumptions that are 
fundamental to IRT. The unidimensionality assumption requires that only one ability or 
latent trait is measured by the various items that make up the test. Intuitively, this 
assumption cannot be strictly satisfied due to the reality that multiple factors will 
normally impact the test taking performance by an examinee. Exogenous factors such as 
generic cognitive ability, test anxiety and motivation level are likely to impact test 
performance as well. In order for a set of test data to satisfy the assumption of 
unidimensionality, a ‘dominant’ factor influencing performance must be present 
(Hambleton et al. 1991). This dominant factor is referred to as the ability or latent trait 
measured by the test. 

The assumption of local independence requires that an examinee’s responses to 
the various items in a test are statistically independent of each other (Hambleton and 
Swaminathan 1985). This implies that an examinee’s response to any one item will not 
affect their response to any other item in the test. Simply put, the trait specified in the 
model is the only factor influencing the respondent’s answer to the test items (Hambleton
et al. 1991) and one item does not hold clues for subsequent items. It is important to note
that the assumption of local independence does not imply that the test items are 
uncorrelated across the total group of examinees (Lord and Novick 1968). Whenever there
is variation among the examinees on the measured ability, positive correlations between 
pairs of items will result. However, item scores are uncorrelated at a fixed ability level 
(Hambleton and Swaminathan 1985). 
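
The distinction between correlation across the whole group and independence at a fixed ability can be checked with a small calculation. The sketch below is illustrative only: it assumes the logistic Rasch form for the response probability, two hypothetical items, and a hypothetical population split evenly between two ability levels. None of these values come from the paper's data.

```python
import math

def rasch_p(theta, d):
    """Probability of a correct response under the (assumed) Rasch form."""
    return 1.0 / (1.0 + math.exp(-(theta - d)))

# Hypothetical item difficulties and ability groups (illustrative values).
d1, d2 = -0.5, 0.5
abilities = [-1.0, 1.0]
weights = [0.5, 0.5]

def item_covariance(thetas, wts):
    """Covariance of the two item scores over a mix of ability levels,
    treating the responses as independent once theta is fixed."""
    e1 = sum(w * rasch_p(t, d1) for t, w in zip(thetas, wts))
    e2 = sum(w * rasch_p(t, d2) for t, w in zip(thetas, wts))
    e12 = sum(w * rasch_p(t, d1) * rasch_p(t, d2) for t, w in zip(thetas, wts))
    return e12 - e1 * e2

cov_total_group = item_covariance(abilities, weights)   # mixed abilities
cov_fixed_ability = item_covariance([1.0], [1.0])       # one ability level

print(f"covariance across the whole group: {cov_total_group:.4f}")
print(f"covariance at a fixed ability:     {cov_fixed_ability:.4f}")
```

The covariance is positive when the two ability groups are pooled and vanishes when ability is held fixed, which is the pattern the paragraph describes.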

There are three primary advantages to using item response models (Hambleton 
and Swaminathan 1985): 

1. Assuming the existence of a large pool of items each measuring the same latent
trait, the estimate of an examinee’s ability is independent of a particular
sample of test items that are administered to the examinee;

2. Assuming the existence of a large population of examinees, the descriptors of a
test item (e.g., item difficulty, item discrimination) are independent of the
particular sample of examinees drawn for the purpose of item calibration;
and

3. A statistic indicating the precision with which each examinee’s ability is
estimated is provided.

Thus, the primary argument for employing IRT methods is that the resulting analyses are 
both person-free and sample-free measurements (McKinley and Mills 1989). It should be 
noted that not all researchers agree that IRT offers us such rich benefits. Lawson (1991) 
subjected three test data sets to both classical and Rasch procedures and found 
“remarkable similarities” between the results. Findings for both examinee abilities and 
item difficulties yielded “almost identical information.” Given the mathematical 
intricacies of IRT that are not required of classical methods, Lawson questioned the 
necessity of the Rasch procedure. That is, once misfitting items and people are removed
from the analysis, IRT and CTT models seem to yield highly correlated person ability and
item difficulty estimates. The Rasch model continues, however, to be utilized by 
psychometricians. The recent rise in adaptive testing bears testament to the continued use 
of IRT. 



THE RASCH MODEL CALCULATIONS 

Having noted the basic deficiencies of classical test theory and the improvements 
that the more theoretically-based item response theory provides us, attention is now 
focused on the step-by-step calculations in the one-parameter IRT measurement. The 
Rasch calculations can appear daunting to many students. While extremely powerful in 
its applications, the fundamentals of IRT are actually quite straightforward and should not 
be viewed as a black box process. It is hoped that the following discussion will facilitate 
the conceptual grasp of the subject. 

In the following data example, presume that 35 people were tested on an 18-item
exam. Since the object of the item response model is to predict performance based on 
item calibrations that are independent of the persons generating the data (i.e., person free) 
and examinee ability estimates that are independent of the items used in the measurement 
(i.e., item free), all items that are answered either correctly or incorrectly by everyone will 
be removed from further analysis. Likewise, any person who answered either 0% or 100% 
of the items correctly will also be removed since neither can be calibrated against the 
group and thus provide us with no usable information. That is, such items and people 
provide no information to facilitate the estimation process (e.g., the person with all of the 
items correct may be exactly smart enough to do that, or may have any of the infinite
ability levels above the exact ability that is just sufficient to yield this perfect score). The
resulting data set after this initial cut of the information can be seen in Table 1 . 

Insert Table 1 about here 

Table 1 is laid out so that examinees are sorted in increasing order of number of 
items answered correctly while items are sorted in increasing order of number of 
examinees that correctly answered the item. Since responses were dichotomously scored 
as either right or wrong, a ‘0’ in the table denotes an incorrect answer while a ‘1’ denotes
a correct response. Looking at Table 1, examinee 25 answered the fewest number of 
questions correctly (2) while examinees 24, 34 and 7 answered the greatest number of 
items correctly (11). Remember that any examinee who answered either 100% or 0% of the items correctly has
been removed. Any ‘perfect’ items have also been removed. In this data set, item
numbers 1, 2, 3 and 18 were removed while examinee 35 was removed. This editing of the
data continues in this manner until no ‘perfect’ items or persons remain. 
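
The editing rule described above can be sketched as a short routine. This is a minimal illustration, not the author's spreadsheet: the function name `edit_responses` and the small response matrix are hypothetical, and rows are taken to be persons and columns to be items.

```python
def edit_responses(matrix):
    """Drop persons (rows) and items (columns) whose scores are all 0 or all 1,
    repeating until no such row or column remains.
    Returns the indices of the rows and columns that are kept."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0]))) if matrix else []
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            score = sum(matrix[r][c] for c in cols)
            if score == 0 or score == len(cols):
                rows.remove(r)
                changed = True
        for c in list(cols):
            count = sum(matrix[r][c] for r in rows)
            if count == 0 or count == len(rows):
                cols.remove(c)
                changed = True
    return rows, cols

# Hypothetical data: 4 persons by 4 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]

kept_persons, kept_items = edit_responses(responses)
print(kept_persons, kept_items)
```

In this example the first pass removes the first and last items (answered correctly by everyone and by no one, respectively). That in turn leaves two persons with perfect or zero scores on the remaining items, which is why the editing has to be repeated until nothing more is removed.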

Given this initial editing of the data, the next step in the process is to calibrate 
both the item difficulties and the person abilities. In order for us to make valid 
assessments and predictions arising from the Rasch model, both of these statistics 
(difficulties and abilities) must be linear and in the same metric. In IRT, this is
accomplished by converting the values into logits. The logit for an item’s difficulty is
calculated as the natural log of the proportion of examinees who answered the item
incorrectly divided by the proportion who answered it correctly (see footnote 1).
Conversely, the logit for a person’s ability is the natural log of the
proportion of items that the examinee answered correctly divided by the proportion
answered incorrectly. These conversions from proportions to logits can be seen in greater
detail in Tables 2 and 3, respectively.
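
The two conversions can be sketched as follows. The function names are hypothetical, and the counts in the usage lines are illustrative, chosen to resemble the edited data set (34 examinees, 14 retained items, raw scores between 2 and 11).

```python
import math

def item_difficulty_logit(n_correct, n_persons):
    """Initial item difficulty: ln(proportion incorrect / proportion correct)."""
    p_correct = n_correct / n_persons
    return math.log((1.0 - p_correct) / p_correct)

def person_ability_logit(n_correct, n_items):
    """Initial person ability: ln(proportion correct / proportion incorrect)."""
    p_correct = n_correct / n_items
    return math.log(p_correct / (1.0 - p_correct))

# An easy item: answered correctly by 32 of 34 examinees.
easy_item = item_difficulty_logit(32, 34)

# A low and a high scorer on a 14-item test.
low_scorer = person_ability_logit(2, 14)
high_scorer = person_ability_logit(11, 14)

print(f"easy item difficulty: {easy_item:.2f} logits")
print(f"low scorer ability:   {low_scorer:.2f} logits")
print(f"high scorer ability:  {high_scorer:.2f} logits")
```

Note the opposite orientation of the two ratios: an easy item receives a negative difficulty, while a high-scoring person receives a positive ability, so both sit on the same scale.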

Insert Tables 2 & 3 about here 

Once the logit values for both the persons and items are calculated, we have
overcome another weakness associated with CTT. Namely, while item difficulty and
person ability levels realistically range from negative infinity to positive infinity, the
proportions correct and incorrect are bounded by the values of zero and one. Conversion to logits
transforms the values onto a scale running from -∞ to +∞. One further step involves calculating the mean
and standard deviation of the data and converting the logits to a standard (z) scale,
arbitrarily assigning a center point value of zero. While the scale theoretically runs from
-∞ to +∞, values realistically tend to vary between -3 and +3 logits. Table 4 highlights the
relationship between proportion (of correct responses) and person ability logits for this
data set. Additionally, Figure 1 graphically portrays the relationship between proportions
and logits.

Insert Table 4 and Figure 1 about here 

Two final calibration steps remain. The initial measurement of item difficulty
must be corrected for the ability dispersion of the persons (sample spread). Additionally, the initial
measurement of person ability needs to be corrected for the difficulty dispersion of the items (test width).
Calculations are modeled in Tables 5 and 6 and result in item calculations that are
corrected for sample spread and person calculations that are corrected for test width. This
step is known as the calculation of expansion factors and is crucial given the premise of
one-parameter IRT that the achievement of any person on a given item is solely
dependent upon that person’s ability and the difficulty of the specific item.

Footnote 1: $\ln[p_i/(1 - p_i)]$, where $p_i$ is the proportion of examinees answering item $i$ incorrectly.
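
The paper does not print the expansion-factor formulas. The sketch below assumes they follow the PROX procedure in Wright and Stone (1979), where the constants 2.89 and 8.35 are 1.7 squared and 1.7 to the fourth power. The function name, variable names and input logits are all hypothetical.

```python
import math
from statistics import mean, variance

def prox_estimates(item_logits, person_logits):
    """Center the item logits, then apply PROX-style expansion factors.
    item_logits: one initial difficulty logit per item.
    person_logits: one initial ability logit per person."""
    item_mean = mean(item_logits)
    centered = [x - item_mean for x in item_logits]  # mean item difficulty = 0
    u = variance(centered)        # spread of item difficulties (test width)
    v = variance(person_logits)   # spread of person abilities (sample spread)
    denom = 1.0 - u * v / 8.35
    if denom <= 0:
        raise ValueError("variances too large for the PROX approximation")
    y = math.sqrt((1.0 + v / 2.89) / denom)  # item factor: sample spread
    x = math.sqrt((1.0 + u / 2.89) / denom)  # person factor: test width
    final_items = [y * d for d in centered]
    final_persons = [x * b for b in person_logits]
    return final_items, final_persons, y, x

# Hypothetical initial logits for 5 items and 4 persons.
items, persons, item_factor, person_factor = prox_estimates(
    [-2.0, -1.0, 0.0, 1.0, 2.0],
    [-1.0, 0.0, 0.0, 1.0],
)
print(f"item expansion factor:   {item_factor:.2f}")
print(f"person expansion factor: {person_factor:.2f}")
```

Both factors exceed 1, so the corrected estimates are stretched away from zero. In this example the items are more dispersed than the persons, so the person factor is the larger of the two.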

Insert Tables 5 & 6 about here 

The final step in the modeling process is to fit the model, obtained via the
preceding steps, to the data and evaluate the goodness of fit. One cannot merely assume
that the preceding steps are sufficient in developing an effective model. If we re-examine
Table 1, a pattern in responses should emerge. Since items are ordered by increasing level
of difficulty and examinees are progressively ranked according to correct responses, we
would intuitively expect to see more incorrect responses to the top and right of the table
and vice versa. In other words, we would expect a person who is higher on a latent trait
(θ, theta) to have a greater chance of answering a difficult question correctly than a person who is
lower on that latent trait. Table 7 accentuates the different ‘expectations’.

While those persons or items that do not fit well with the model are statistically 
identified by the software program used to calculate the Rasch model, some of the 
potential misfits are circled here in order to visually highlight points where the model 
does not perfectly fit the data. It should be noted that both items and persons can be 
identified as aberrant. For example, person 13 answered item 12 correctly when the 
expectation (given other responses) would be that the item would be answered 
incorrectly. Also, items 6 and 8 were answered incorrectly by person 12 when the 
expectation would be a correct response. 







Insert Table 7 about here 

Once an IRT computer program identifies misfits between the model and the data, 
the source of the variance (item- or person-based) is explored. In examining Table 7, 
persons 13 and 29 appear to post responses that are aberrant to expectations. Likewise, 
items 6, 7, 8 and 12 do not fit with expectations. Each of these variants in the table is
circled for easier identification. As mentioned previously, the source of the
inconsistencies can originate at either the item or person level. Table 8 simulates how the 
software program would investigate the irregularities caused by persons 29 and 13, for 
example. 

Each item has a calculated difficulty level (d) and each person has a calculated 
ability level (theta). The first step in the analysis of fit is to determine the difference 
between the ability level and the difficulty level for each person and each item. Table 8 
highlights the values involved for persons 29 and 13. When the difference in the two 
values is a positive number, it is an indication that that particular item should be ‘easy’ 
for that particular examinee and should be answered correctly. The higher the number, the 
greater the likelihood of a correct response. Conversely, the more negative the difference, 
the greater the likelihood that the item difficulty exceeds the person’s ability. 
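
The link between the difference (theta - d) and the expected response can be sketched as below, assuming the logistic Rasch form for the probability. The ability of zero and the difficulty of -3.3 are chosen to match the item 7 entries in Table 9; the hard item with d = 2.5 is hypothetical.

```python
import math

def expected_response(theta, d):
    """Return the difference theta - d, the (assumed Rasch) probability of a
    correct response, and the response the model expects (1 or 0)."""
    diff = theta - d
    p = math.exp(diff) / (1.0 + math.exp(diff))
    expected = 1 if diff > 0 else 0
    return diff, p, expected

theta = 0.0
for label, d in [("easy item (d = -3.3)", -3.3), ("hard item (d = 2.5)", 2.5)]:
    diff, p, expected = expected_response(theta, d)
    print(f"{label}: theta - d = {diff:+.1f}, P(correct) = {p:.2f}, expect {expected}")
```

A large positive difference gives a probability close to one, so a miss on such an item is the kind of response that stands out when fit is examined.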

Insert Table 8 about here 

Looking at Table 8, it appears that person 29 missed item 7, when they should 
have theoretically answered it correctly, while correctly answering item 14, when the 
probability was that it would be missed by a person with a theta equal to zero. Person 13
missed both items 6 and 7 while getting item 12 correct, all opposite of expectations.
Remember, however, that the source of variance can result from item irregularities as 
well. Table 9 facilitates our understanding of how item response patterns are examined 
for misfitting results. The process is similar to the aforementioned one. This table 
illustrates the examination of items 6 and 7 for all persons. Again, the process occurs for 
all items and all persons. 



Both items 6 and 7 have fairly high negative logit values for item difficulty levels 
indicating that they should be answered correctly by most examinees. Indeed, an 
examination of the results in Table 9 shows that only three of the 34 examinees missed 
item 7 while only four missed item 6. It appears that it is not items 6 and 7 that are 
causing the irregularity between the model and the data but rather examinees 13 and 29. 
This is exactly what is occurring. The removal of these two persons from the data set 
eliminates most of the irregularity associated with the two items. 
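
The z-squared column of Table 9 appears to follow the squared standardized residual used in Wright and Stone (1979): exp(theta - d) for an incorrect response and exp(d - theta) for a correct one. The sketch below assumes that form. Item 7's difficulty of -3.3 is inferred from the table's differences; the person labels and most abilities are illustrative.

```python
import math

def squared_residual(response, theta, d):
    """Squared standardized residual for one response under the Rasch model:
    exp(theta - d) if the item was missed, exp(d - theta) if answered correctly."""
    diff = theta - d
    return math.exp(diff) if response == 0 else math.exp(-diff)

def item_fit(records, d):
    """records: list of (person_label, theta, response).
    Returns the sum of squared residuals and the single worst-fitting record."""
    scored = [(squared_residual(x, theta, d), label) for label, theta, x in records]
    total = sum(z2 for z2, _ in scored)
    worst = max(scored)
    return total, worst

# Illustrative records for an easy item (d = -3.3).
records = [
    ("low scorer, missed", -3.8, 0),
    ("low scorer, correct", -2.8, 1),
    ("average scorer, correct", 0.0, 1),
    ("average scorer, missed", 0.0, 0),   # the unexpected response
    ("high scorer, correct", 2.8, 1),
]

total, (worst_z2, worst_label) = item_fit(records, d=-3.3)
print(f"sum of squares: {total:.1f}")
print(f"largest residual: {worst_z2:.1f} ({worst_label})")
```

A single unexpected miss by an examinee of average ability accounts for nearly all of the item's sum of squares, which is why removing such persons, rather than the item, restores the fit.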

Upon removal of persons 13 and 29, the process iterates and a new evaluation of
fit is calculated for the remaining distributions. Again, all combinations of persons and 
items are examined. At the point at which no further removal of either items or persons 
enhances the goodness of fit, the model is said to “fit” the data and the result is items that 
are theoretically both unidimensional and independent. By eliminating both items and 
individuals that deviate from expectations, we can develop a test bank of items that 
should optimally fit the individual person ability levels for most test takers. 



Insert Table 9 about here 






Figure 2 illustrates one-parameter item characteristic curves (ICC) for four
hypothetical items. Latent ability (θ) is represented, in logits, along the x-axis. The
probability of a correct response is located on the y-axis. Since in the one-parameter IRT
model no traits other than ability (e.g., guessing) are assumed to impact responses, the
curves are asymptotic to the zero and one points of the probability distribution. The
difficulty level of each item is defined as the logit point at which the probability of
answering the item correctly is 50% (p = 0.50). Therefore, those items with curves that
are toward the right side of the x-axis are more difficult than those to the left. For example,
the item difficulty for item 3 is -1.0 while the item difficulty for item 2 is approximately
+2.0. Therefore, persons with an ability (θ) equal to zero would probably answer item 3
correctly and miss items 1 and 2. There is a 50% chance of the person answering item 4
correctly.
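
The reading of Figure 2 can be reproduced numerically. The sketch below assumes the logistic Rasch form for the curves. The difficulties for items 2, 3 and 4 follow the values described in the text (item 4 is placed at zero because a person of ability zero has a 50% chance on it); the value for item 1 is not stated and is an assumption.

```python
import math

def rasch_icc(theta, d):
    """Probability of a correct response at ability theta for difficulty d."""
    return 1.0 / (1.0 + math.exp(-(theta - d)))

# Difficulties in logits; the value for item 1 is assumed for illustration.
difficulties = {"item 1": 1.0, "item 2": 2.0, "item 3": -1.0, "item 4": 0.0}

theta = 0.0
probabilities = {name: rasch_icc(theta, d) for name, d in difficulties.items()}

for name in sorted(probabilities):
    p = probabilities[name]
    if p > 0.5:
        verdict = "likely correct"
    elif p < 0.5:
        verdict = "likely missed"
    else:
        verdict = "even odds"
    print(f"{name}: d = {difficulties[name]:+.1f}, P(correct) = {p:.2f} ({verdict})")
```

Each curve crosses p = 0.50 at its own difficulty, so moving a curve to the right lowers the probability of success for a person at any fixed ability.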

Insert Figure 2 about here 

Figures 3 and 4 are simply added as a point of comparison and for further
edification. The two-parameter model assumes two parameters are affecting examinee
responses: ability and item discrimination. Curve endpoints are still asymptotic as
answers can only be correct or incorrect. With the two-parameter curve, the slope of the
curve indicates how well the item differentiates between persons with varying latent
abilities. For instance, item 2 in Figure 3 has a much flatter slope than that of item 4.
Therefore, item 4 is a better discriminating item.

Figure 4, the three-parameter model, adds a third variable to the equation: the
effect of guessing. Here, the curve endpoint may begin at a value other than zero as the
impact of correctly guessing an item is taken into account. The evaluation guidelines that
applied to the other two ICCs apply here as well; however, the location of the initial 
endpoint gives the researcher an indication as to how effective item distracters may be. 
For example, items 3 and 6 appear to be potentially guessed correctly whereas items 2 
and 4 do not. 
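
The effect of the two additional parameters can be sketched with the usual logistic form of the three-parameter curve. The paper does not give this equation, so the function and its parameter names (d for difficulty, a for discrimination, c for the lower asymptote) are assumptions, and the scaling constant of 1.7 that some texts include is omitted. All item values are hypothetical.

```python
import math

def icc(theta, d, a=1.0, c=0.0):
    """Logistic ICC with difficulty d, discrimination a and lower asymptote c.
    With a = 1 and c = 0 it reduces to the one-parameter form."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - d)))

def rise_around_difficulty(d, a, c=0.0, half_width=0.5):
    """Change in P(correct) across a one-logit window centered on d."""
    return icc(d + half_width, d, a, c) - icc(d - half_width, d, a, c)

# Two-parameter comparison: a flat item versus a steep item.
flat_item = rise_around_difficulty(d=0.0, a=0.5)
steep_item = rise_around_difficulty(d=0.0, a=2.0)

# Three-parameter case: a very low-ability examinee on an item with c = 0.2.
floor_with_guessing = icc(-10.0, d=0.0, a=1.0, c=0.2)

print(f"rise across one logit, a = 0.5: {flat_item:.2f}")
print(f"rise across one logit, a = 2.0: {steep_item:.2f}")
print(f"P(correct) far below the difficulty, c = 0.2: {floor_with_guessing:.2f}")
```

The steeper item separates examinees just below and just above its difficulty more sharply, and a nonzero lower asymptote keeps the probability of success from falling to zero even at very low ability.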

Insert Figures 3 and 4 about here 



SUMMARY 

Item response theory models arose from the inherent limitations of classical test
theory methods of test analysis. Chief among the limitations is that examinee
characteristics cannot be separated from test characteristics. Item response theory
overcomes these limitations and rests on two major assumptions: (a) the performance of
an examinee can be explained by a set of factors known as traits, and (b) the relationship
between an individual’s item performance and the trait(s) underlying that performance
can be described by a monotonically increasing function termed an item characteristic curve.

Item response theory allows the researcher to develop test questions that are 
theoretically both person-free and item-free. IRT stresses maximizing the test 
information function over the range of abilities that are of interest instead of maximizing 
reliability, as does classical psychometrics. While the usefulness of IRT continues to be 
debated, IRT appears to hold many benefits. Among these are a more accurate ability to 
detect item or test bias, the ability to administer customized, individualized, computer-adaptive
tests and the ability to construct more effective tests, in general. It is hoped that
this paper has facilitated a better understanding of both the mechanics and the rationale
behind item response theory (IRT) measurement.







REFERENCES 


Crocker, L. and J. Algina (1986), Introduction to Classical and Modern Test Theory. 
New York: Holt, Rinehart and Winston. 

Hambleton, Ronald K. and H. Swaminathan (1985), Item Response Theory: Principles 
and Applications. Boston: Kluwer. 

— , — , and H. Jane Rogers (1991), Fundamentals of Item Response Theory. Newbury 
Park: Sage Publications. 

Lawson, Stephen (1991), “One Parameter Latent Trait Measurement: Do the Results 
Justify the Effort?,” in Advances in Education Research: Substantive Findings, 
Methodological Developments, Bruce Thompson, ed. Greenwich: JAI Press. 

Lord, F. M. (1980), Applications of Item Response Theory to Practical Testing Problems. 
Hillsdale, NJ: Lawrence Erlbaum. 

Lord, F. M. and M. R. Novick (1968), Statistical Theories of Mental Test Scores. Reading, MA:
Addison-Wesley. 

McKinley, R. and C. Mills (1989), “Item Response Theory: Advances in Achievement
and Attitude Measurement,” in Advances in Social Science Methodology, Bruce
Thompson, ed. Greenwich: JAI Press.

Nunnally, Jum C. and Ira H. Bernstein (1994), Psychometric Theory. New York: 
McGraw-Hill, Inc. 

Wright, Benjamin D. and Mark H. Stone (1979), Best Test Design. Chicago: MESA 
Press. 







Table 1 

Edited Responses of 34 Examinees 



[Table body not recoverable from the scanned source. The only legible row, labeled "P of 34", lists one value for each of the 14 retained items: 0.94, 0.91, 0.91, 0.88, (illegible), 0.79, 0.71, (illegible), 0.21, 0.18, 0.09, 0.03, 0.03, 0.03.]



Table 2

[Table not recoverable from the scanned source.]

Table 3

[Table not recoverable from the scanned source.]




Table 4 

Logit Conversion Chart 



[Table body not recoverable from the scanned source.]




Table 5

Final Estimates of Item Difficulty

[Table body not recoverable from the scanned source.]

Table 6

[Table not recoverable from the scanned source.]




Table 7 

Response Patterns of Persons to Items 



[Table body not recoverable from the scanned source.]

Table 8

[Table not recoverable from the scanned source.]



Table 9

Fit Analysis for Items 6 and 7

                               Item 7                                 Item 6
                             x = 0      x = 1                       x = 0      x = 1
Person  Ability  Response  (theta-d)  (d-theta)   z²    Response  (theta-d)  (d-theta)   z²
        (theta)

    25     -3.8         0       -0.5               1           1                   0.9    3
     4     -2.8         1                  -0.5    1           0        0.1               1
    33     -2.8         1                  -0.5    1           0        0.1               1
     1     -1.9         1                  -1.4    0           1                  -1.0    0
    27     -1.9         1                  -1.4    0           1                  -1.0    0
    11     -1.2         1                  -2.1    0           1                  -1.7    0
    12     -1.2         1                  -2.1    0           0        1.7               6
    17     -0.6         1                  -2.7    0           1                  -2.3    0
    19     -0.6         1                  -2.7    0           1                  -2.3    0
    30     -0.6         1                  -2.7    0           1                  -2.3    0
     2      0.0         1                  -3.3    0           1                  -2.9    0
     3      0.0         1                  -3.3    0           1                  -2.9    0
     5      0.0         1                  -3.3    0           1                  -2.9    0
     6      0.0         1                  -3.3    0           1                  -2.9    0
     8      0.0         1                  -3.3    0           1                  -2.9    0
     9      0.0         1                  -3.3    0           1                  -2.9    0
    15      0.0         0        3.3              27           0        2.9              18
    16      0.0         1                  -3.3    0           1                  -2.9    0
    26      0.0         1                  -3.3    0           1                  -2.9    0
    28      0.0         1                  -3.3    0           1                  -2.9    0
    29      0.0         0        3.3              27           1                  -2.9    0
    31      0.0         1                  -3.3    0           1                  -2.9    0
    10      0.6         1                  -3.9    0           1                  -3.5    0
    18      0.6         1                  -3.9    0           1                  -3.5    0
    14      0.6         1                  -3.9    0           1                  -3.5    0
    32      0.6         1                  -3.9    0           1                  -3.5    0
    20      0.6         1                  -3.9    0           1                  -3.5    0
    21      1.2         1                  -4.5    0           1                  -4.1    0
    22      1.2         1                  -4.5    0           1                  -4.1    0
    23      1.2         1                  -4.5    0           1                  -4.1    0
    34      1.2         1                  -4.5    0           1                  -4.1    0
    13      1.9         1                  -5.2    0           1                  -4.8    0
     7      2.8         1                  -6.1    0           1                  -5.7    0
    24      2.8         1                  -6.1    0           1                  -5.7    0

Sum of Squares                                    57                                     29
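The z² entries in Table 9 appear to follow from exponentiating the tabled logit difference: exp(theta - d) when the response is incorrect (x = 0) and exp(d - theta) when it is correct (x = 1). The Python sketch below works under that assumption and is not the spreadsheet procedure itself. The item difficulties of -3.3 (Item 7) and -2.9 (Item 6) are read off the table rows with an ability of 0, and the function names and the sample of examinees are illustrative only.

```python
import math

# Item difficulties in logits, read off Table 9: for an ability of 0 the
# tabled (theta - d) is 3.3 for Item 7 and 2.9 for Item 6.
DIFFICULTY = {"item7": -3.3, "item6": -2.9}


def squared_residual(theta, d, x):
    """Squared standardized residual for one scored response x (0 or 1).

    Assumed form: exp(theta - d) for an incorrect response and
    exp(d - theta) for a correct one.
    """
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return math.exp(theta - d) if x == 0 else math.exp(d - theta)


def sum_of_squares(responses, d):
    """Add z^2 over (ability, response) pairs for a single item."""
    return sum(squared_residual(theta, d, x) for theta, x in responses)


# Illustrative subset of (ability, response) pairs for Item 7.
item7_sample = [(-3.8, 0), (-2.8, 1), (0.0, 1), (0.0, 0)]

for theta, x in item7_sample:
    z2 = squared_residual(theta, DIFFICULTY["item7"], x)
    print(f"theta={theta:5.1f}  x={x}  z^2={z2:6.2f}")

print(f"sum = {sum_of_squares(item7_sample, DIFFICULTY['item7']):.2f}")
```

Under this form, an examinee of ability 0 who misses Item 7 contributes a z² of about 27, while every expected response contributes less than 1, so a few surprising responses dominate the sum of squares. Because the tabled abilities are rounded to one decimal place, values computed this way can differ slightly from the tabled ones.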




35 



Figure 1 

Scatterplot of Proportions to Logits 








Figure 2 

One-Parameter Model 







[Axis label: Ability]



Figure 3 

Two-Parameter Model 




[Axis label: Ability]



Figure 4 

Three-Parameter Model 









